S.E.E. Food, a Shazam for Food.
Inspiration
I recently rewatched one of my favorite shows: HBO's hit comedy Silicon Valley. Probably one of the most memorable b-plots from Season 4 was Jian Yang's "Shazam for Food" app.
I figured with fast.ai's image classification library, built on state-of-the-art neural networks, we might be able to fare a bit better than young Jinnathan and recognize more foods than just hot dog and not hot dog.
Fast.AI: No Math? No Problem.
Despite what the name might have you believe, the best thing about fast.ai is actually its fastness. And also its AI. The core philosophy of this hip new online deep learning course is hands-on, practical examples before a dump of (frankly) incomprehensible math.

I've always been a kinesthetic learner. I need to touch and play with things in order to really understand them. Books and lectures put me right to sleep. Needless to say, this approach of showing the end result first really appealed to me.
That's also to say, I kind of don't know what I'm doing here. So bear with me as I attempt to explain this totally-not-copied-from-the-lecture-notes Jupyter notebook. Also, we like to play things fast and loose around here. If you're a kid, I'd suggest you go back to watching Arthur or something. There are some bad words in this motherfucker.
That's a "Bing"-o!...is that the expression, that's a bingo?
So it turns out that Bing is more than just the place you go to look up how to change your default search engine to Google.
haha. But all due respect to Microsoft, I was MacroHard when I saw their image search offerings. It really saves me the hassle of having to scrape the internet for thousands of food pictures. Time to put away those Puppeteer scripts.
```python
key = os.environ.get('AZURE_SEARCH_KEY', 'XXXXXXXXXXXXXXXXXXXXXXXXXXX')

results = search_images_bing(key, 'hot dog')
ims = results.attrgot('content_url')
len(ims)
print(ims)
```
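One catch: `search_images_bing` isn't a stock Python function; it's a helper that ships with the fastbook course utils. If you're following along outside the course notebook, here's a rough stdlib-only sketch of what it does under the hood, hitting the Bing Image Search v7.0 REST endpoint directly. Note this sketch returns a plain list of URLs, not the fastcore `L` object that the `.attrgot('content_url')` call above expects:

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

def search_images_bing(key, term, max_images=150):
    """Rough stand-in for the fastbook helper: query Bing Image Search v7.0."""
    qs = urlencode({'q': term, 'count': max_images})
    req = Request(
        f'https://api.bing.microsoft.com/v7.0/images/search?{qs}',
        headers={'Ocp-Apim-Subscription-Key': key},
    )
    with urlopen(req) as resp:
        data = json.load(resp)
    # Each hit carries a 'contentUrl' field with the full-size image URL.
    return [r['contentUrl'] for r in data['value']]
```

Treat the field names and endpoint as "check the current Azure docs" territory; Microsoft has renamed this service more than once.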
Not Hot Dog
Unlike the TV counterpart, our SeeFood is going to support more than just hotdogs. I've got some omelette, rice, fried chicken and hamburger, in addition to pizza and spaghetti, the two foods that they actually tried in the show.
I'm just now realizing octopus would've been a good addition.
Anyways, in the code below, we're gonna loop through our food types and use Bing's image search to download about 150 photos per type. We'll also get rid of any images that failed to download or won't open. This takes a long-ass time.
```python
types = 'hot dog', 'pizza', 'omelette', 'rice', 'spaghetti', 'fried chicken', 'hamburger'
path = Path('food')

if not path.exists():
    path.mkdir()
    for o in types:
        dest = (path/o)
        dest.mkdir(exist_ok=True)
        results = search_images_bing(key, f'{o}')
        download_images(dest, urls=results.attrgot('content_url'))

# Drop any images that failed to download or won't open.
fns = get_image_files(path)
failed = verify_images(fns)
failed.map(Path.unlink);
```
Big Boy Machine Learning.
Time for the secret sauce. Let's teach this bitch computer to learn. But before we do that, we gotta set up some fast.ai-specific data structures for fetching our images, splitting up our training and validation sets, assigning labels, and resizing our images. The DataBlock class does exactly that.
We've also gotta do some data augmentation. This helps model performance by basically zooming in on different parts of the picture each epoch. I just picture a geriatric robot looking at a picture close up and moving it around to really study the heck out of it.

```python
db = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    get_items=get_image_files,
    splitter=RandomSplitter(valid_pct=0.2, seed=42),
    get_y=parent_label,
    item_tfms=Resize(128))
dls = db.dataloaders(path)

# Swap in random crops + standard augmentations for training.
db = db.new(
    item_tfms=RandomResizedCrop(224, min_scale=0.3),
    batch_tfms=aug_transforms())
dls = db.dataloaders(path)
```
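If you want to eyeball what the augmentation is actually doing, fastai can draw several randomly transformed variants of the same training image (the `unique=True` flag is what repeats one image; this is a standard fastai sanity check, shown here as a sketch):

```python
# Show 8 augmented variants of a single training image, so you can
# see what RandomResizedCrop + aug_transforms are doing per epoch.
dls.train.show_batch(max_n=8, nrows=2, unique=True)
```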
```python
learn = cnn_learner(dls, resnet18, metrics=error_rate)
learn.fine_tune(4)

interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix()
interp.plot_top_losses(5, nrows=1)
```
Hamburger Cheeseburger Big Mac Whopper
Quotes from the underground:
"Who deleted my spaghetti?"
"A chicken is not an omelette....oh shit, is it? 🤔 "
The model's not bad, about a 90% success rate. But we've got some crappy images in our training and validation sets. Some wise ass decided to upload pictures of hamburgers with pizzas on top and file them under "hamburger". Can you really blame the bot?
But there are some other issues. I probably should have specified "rice food". The model used images of Condoleezza Rice as part of the training set.

I am not a food (Condoleezza Rice, in a speech to the U.N. General Assembly)
So we gotta clean this data up. Gotta really scrub it nice. Get between the a**cheeks. I really hope nobody ever reads this.
```python
cleaner = ImageClassifierCleaner(learn)
cleaner
```
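One gotcha: the cleaner widget only records your choices; it doesn't actually touch the files. After you've flipped through each category, you apply the deletions and relabels yourself, which in the fastbook version of this flow looks roughly like:

```python
import shutil

# ImageClassifierCleaner just records decisions; apply them manually.
# Delete everything you marked <Delete>...
for idx in cleaner.delete():
    cleaner.fns[idx].unlink()

# ...and move anything you relabeled into its new category folder.
for idx, cat in cleaner.change():
    shutil.move(str(cleaner.fns[idx]), path/cat)
```

After cleaning, you'd rebuild the DataLoaders and retrain on the scrubbed data.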
LGTM. Ship it.
My dudes. We have just built See Food the way it was meant to be built. I'll be taking that half a mil and a palapa now.
In all seriousness, this whole process was refreshingly free of Sigmoid functions and partial derivatives. You'll notice the hard ML stuff was 2 lines of pretty short code. That's it! That's all it takes. Fuck was I even doing before amirite?
So now what? We have this cool model, but how do we get it into the hands of people to use? Well, look here son. We can actually export our model to a Pickle file. (It's the funniest shit I've ever seen!).

Cool thing is, we can actually unpack this file and run our prediction code without needing a GPU. So we can chuck this Pickle into the cloud, on CPU instances and start sending our requests there. But for now, we're going to just use some tools (Voila and Binder) to host our Jupyter notebook on Github pages for people to try out.
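Concretely, the save/load dance looks something like this ('export.pkl' is fastai's default filename, and `'my_hot_dog.jpg'` is a hypothetical test image; the inference Learner runs fine on CPU):

```python
# Serialize the trained model: architecture, weights, and the
# DataLoaders recipe (transforms, labels) all go into one pickle.
learn.export('export.pkl')

# ...later, possibly on a cheap CPU-only box:
learn_inf = load_learner('export.pkl')
pred, pred_idx, probs = learn_inf.predict('my_hot_dog.jpg')
print(pred, probs[pred_idx])
```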
```python
# Build the UI pieces first...
btn_upload = widgets.FileUpload()
btn_run = widgets.Button(description='Classify')
out_pl = widgets.Output()
lbl_pred = widgets.Label()

# ...then wire the button to run a prediction on the uploaded image.
def on_click_classify(change):
    img = PILImage.create(btn_upload.data[-1])
    out_pl.clear_output()
    with out_pl: display(img.to_thumb(128, 128))
    pred, pred_idx, probs = learn_inf.predict(img)
    lbl_pred.value = f'Prediction: {pred}; Probability: {probs[pred_idx]:.04f}'

btn_run.on_click(on_click_classify)

VBox([widgets.Label('shazam for food'),
      btn_upload, btn_run, out_pl, lbl_pred])
```
The end
So there you have it. A nice little hands on experiment for training and deploying an image classifier. The memes probably haven't aged well for those of you reading this in the future. And for those of you reading this in the past, please teach me how you traveled back in time so I can buy Tesla calls.
Also, if you want to see the end result…
This truly is a special occasion. Thanks for reading. Like comment and subscribe. And click the bell to turn on notifications.

(Seriously, my Twitter and GitHub are below if you want to talk to me. Please be patient though, because I am very introverted.)